{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "8ea6b26a",
   "metadata": {},
   "source": [
    "## 8.1 AlexNet"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "06ec2104",
   "metadata": {},
   "source": [
    "论文可见于：[论文链接](https://dl.acm.org/doi/abs/10.1145/3065386)\n",
    "\n",
    "在上一章的最后详细介绍了卷积神经网络的开山之作LeNet，虽然卷积神经网络最近非常流程，但在LeNet出现后的十几年里，目标识别领域的主流还是传统目标识别算法，当然也与算力限制有关。直到2012年AlexNet赢得ImageNet挑战赛，才使得卷积神经网络再次引起人们的关注，也因此一发不可收拾，不断推陈出新。\n",
    "\n",
    "AlexNet是由Alex Krizhevsky、Ilya Sutskever和Geoffrey Hinton等人设计的。这个模型很大程度上推动了深度学习的发展，并为今天的卷积神经网络模型奠定了基础。截止目前为止，论文引用次数已经超过12万。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "77e15147",
   "metadata": {},
   "source": [
    "### 8.1.1 AlexNet基本思想"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2569d649",
   "metadata": {},
   "source": [
    "模型的设计是受到LeNet的启发的，但是它比LeNet要大得多，有5个卷积层和3个全连接层。它也引入了新的技术，例如使用ReLU激活函数代替Sigmoid激活函数，使用Dropout正则化以防止过拟合等。 AlexNet还是第一个在大型数据集上训练的卷积神经网络模型，它在ImageNet数据集上取得了很好的结果，并且在计算机视觉领域中引起了很大的关注。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "545c04bb",
   "metadata": {},
   "source": [
    "### 8.1.2 AlexNet结构"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3a505f2b",
   "metadata": {},
   "source": [
    "AlexNet结构如下图所示：\n",
    "\n",
    "<img src=\"./images/8-1-1.png\" width=\"100%\"></img>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9778d423",
   "metadata": {},
   "source": [
    "AlexNet是一个8层的卷积神经网络，其中包括5个卷积层和3个全连接层。\n",
    "\n",
    "* 第一层是卷积层，用于提取图像的特征。它包括96个11x11步长为4的卷积核，并使用最大池化。\n",
    "\n",
    "* 第二层是卷积层，包括256个5x5的卷积核，并使用最大池化。\n",
    "\n",
    "* 第三、四、五层也是卷积层，分别包括384个3x3的卷积核，384个3x3的卷积核，和256个3x3的卷积核。这三层后面使用最大池化。\n",
    "\n",
    "* 第六层是全连接层，包括4096个神经元。\n",
    "\n",
    "* 第七层是全连接层，包括4096个神经元。\n",
    "\n",
    "* 第八层是输出层，包括1000个神经元，用于预测图像属于ImageNet数据集中的哪一类。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4f46c5e",
   "metadata": {},
   "source": [
    "AlexNet参数量计算参考下图：\n",
    "\n",
    "<img src=\"./images/8-1-2.png\" width=\"100%\"></img>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "061374ac",
   "metadata": {},
   "source": [
    "### 8.1.3 AlexNet代码实现"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "96ff5372",
   "metadata": {},
   "source": [
    "接下来看一下如何用代码实现相关网络结构。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "057ce823",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 导入必要的库\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "\n",
    "# 定义AlexNet的网络结构\n",
    "class AlexNet(nn.Module):\n",
    "    def __init__(self, num_classes=1000, dropout=0.5):\n",
    "        super().__init__()\n",
    "        # 定义卷积层\n",
    "        self.features = nn.Sequential(\n",
    "            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.MaxPool2d(kernel_size=3, stride=2),\n",
    "            nn.Conv2d(64, 192, kernel_size=5, padding=2),\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.MaxPool2d(kernel_size=3, stride=2),\n",
    "            nn.Conv2d(192, 384, kernel_size=3, padding=1),\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.Conv2d(384, 256, kernel_size=3, padding=1),\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.Conv2d(256, 256, kernel_size=3, padding=1),\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.MaxPool2d(kernel_size=3, stride=2),\n",
    "        )\n",
    "        # 定义全连接层\n",
    "        self.classifier = nn.Sequential(\n",
    "            nn.Dropout(p=dropout),\n",
    "            nn.Linear(256 * 6 * 6, 4096),\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.Dropout(p=dropout),\n",
    "            nn.Linear(4096, 4096),\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.Linear(4096, num_classes),\n",
    "        )\n",
    "\n",
    "    def forward(self, x):\n",
    "        x = self.features(x)\n",
    "        x = torch.flatten(x, 1)\n",
    "        x = self.classifier(x)\n",
    "        return x"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "afd6c1a0",
   "metadata": {},
   "source": [
    "### 8.1.4 小结"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14d67b16",
   "metadata": {},
   "source": [
    "* AlexNet是当时第一个在大型数据集上训练的卷积神经网络模型，使得卷积神经网络成为计算机视觉领域中可行的机器学习方法。\n",
    "\n",
    "* 使用了ReLU激活函数，这使得训练更快且模型性能更好。\n",
    "\n",
    "* 使用Dropout正则化，有利于防止过拟合。\n",
    "\n",
    "* 在数据集上做了很多工作，引入数据增强，例如镜像以及随机剪裁等。\n",
    "\n",
    "* 采用了一种局部响应归一化（Local Response Normalization, LRN）的方法进行处理。当然后面VGGNet等证明LRN并没有实现其预期效果，这里就不做赘述了。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
